Abstract

A  new  strategy  is  presented  for  large  scale  mathematical  programming. 
Several  specific  applications  are  described  and  computational  results  are 
cited.   These  applications  of  the  BOXSTEP  strategy  fall  in  the  conceptual 
continuum  between  steepest  ascent  methods  and  outer  approximation  methods. 
BOXSTEP  is  able  to  capture  the  best  features  of  both  of  these  extremes 
while  at  the  same  time  mitigating  their  bad  features. 
1.   Introduction

This paper presents a very simple idea that has unified some previously
separate areas of theory and has produced some surprising computational
advantages. The idea can be stated as follows. Suppose that we want to
maximize a concave function $v(y)$ over a convex set $Y$. Let $B$ denote a "box"
(i.e., hyper-cube) for which $Y \cap B$ is non-empty. Let $y^*$ be a point at
which $v(y)$ achieves its maximum over $Y \cap B$. If $y^*$ lies in the interior of
the box, then by the concavity of $v$, $y^*$ must be globally optimal. If, on
the other hand, $y^*$ lies on the boundary of the box, then we can translate
$B$ to obtain a new box $B'$ centered at $y^*$ and try again. By "try again" we
mean maximize $v$ over $Y \cap B'$ and check to see if the solution is in the
interior of $B'$. This intuitive idea is developed rigorously in section 2.
Note immediately that we presuppose some appropriate algorithm for solving
each local problem. This "appropriate algorithm" is embedded in a larger
iterative process, namely maximizing $v$ over a finite sequence of boxes.
Computational advantage can be derived if each local problem with feasible
region $Y \cap B$ is significantly easier to solve than the global problem with
feasible region $Y$.

The  problems  that  we  have  in  mind  are  those  where  v(y)  is  the  optimal 
value  of  a  sub-problem  (SPy)  that  is  parameterized  on  y.   Thus  v(y)  is  not 
explicitly  available  and  evaluating  it  at  y  means  solving  (SPy).   This 
arises  in  the  context  of  decomposition  methods  for  large  scale  mathematical 
programs  and  it  was  in  this  context  that  the  BOXSTEP  idea  was  developed. 
We  begin  by  presenting  it  in  a  more  general  context  so  as  to  facilitate  its 
application  to  other  kinds  of  problems,  e.g.,  non-linear  programs  where 
v(y)  is  explicitly  available.   In  this  case  BOXSTEP  bears  some  resemblance 
to the Method of Approximation Programming (MAP) originally proposed by
Griffith  and  Stewart  [   ]  and  recently  revived  by  Beale  [   ]  and  Meyer 
[  ]. 

Section  2  presents  the  BOXSTEP  method  in  very  general  terms  and 
proves  its  convergence.   Section  3  shows  how  an  Outer  Linearization/ 
Relaxation  scheme  can  be  used  to  solve  each  local  problem.   In  this  form 
BOXSTEP  falls  between  the  feasible  directions  methods  at  one  extreme  and 
outer  approximation  or  cutting  plane  methods  at  the  other  extreme.   One 
can  obtain  an  algorithm  of  either  type,  or  something  "in  between",  by 
simply adjusting the size of the box. Sections 4, 5, and 6 contain
specific applications to large structured linear programs. Section 7 relates
BOXSTEP  to  other  recent  developments  in  large  scale  optimization  and 
points  out  some  very  promising  directions  for  additional  research. 




2.   The  BOXSTEP  Method 

BOXSTEP is not a completely specified procedure but rather a method
of replacing a single difficult problem by a finite sequence of simpler
problems. These simpler problems are to be solved by an appropriate
algorithm. This "appropriate algorithm" may be highly dependent on problem
structure but by assuming its existence and convergence we can establish
the validity of the overall strategy. In this section, therefore, we
present a general statement of the BOXSTEP method, prove its finite
$\epsilon$-optimal termination, and discuss some modifications of the basic
method which will not upset the fundamental convergence property.

Consider any problem of the form

(P)       $\max_{y \in Y} v(y)$,   with $Y \subset R^n$ and $v: Y \to R$.

If, for $y^t \in Y$ and $\beta > 0$, the local problem

$P(y^t; \beta)$     $\max_{y \in Y} v(y)$     s.t.     $\|y - y^t\|_\infty \le \beta$

is considerably easier to solve, either initially or in the context of a
reoptimization, then (P) is a candidate for the BOXSTEP method.

BOXSTEP Method

Step 1:   Choose $y^1 \in Y$, $\epsilon \ge 0$, $\beta > 0$. Let $t = 1$.

Step 2:   Using an appropriate algorithm, obtain an $\epsilon$-optimal solution
of $P(y^t; \beta)$, the local problem at $y^t$. Let $y^{t+1}$ denote this
solution.

Step 3:   If $v(y^{t+1}) \le v(y^t) + \epsilon$, stop. Otherwise let $t = t + 1$ and go
to Step 2.
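
The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the names `boxstep`, `solve_local`, `TARGET`, `LO` and `HI` are hypothetical, and the test function (a separable concave quadratic over a cube) was chosen because its local problem can be solved exactly by clipping, which stands in for the "appropriate algorithm" of Step 2.

```python
from typing import Callable, List

Vector = List[float]


def boxstep(v: Callable[[Vector], float],
            solve_local: Callable[[Vector, float], Vector],
            y1: Vector, beta: float, eps: float = 0.0,
            max_boxes: int = 1000) -> List[Vector]:
    """Run Steps 1-3 and return the box centers y^1, ..., y^{T+1}."""
    centers = [list(y1)]                      # Step 1: t = 1
    for _ in range(max_boxes):
        y_t = centers[-1]
        y_next = solve_local(y_t, beta)       # Step 2: solve P(y^t; beta)
        centers.append(y_next)
        if v(y_next) <= v(y_t) + eps:         # Step 3: no sufficient gain, stop
            break
    return centers


# Hypothetical test problem: v(y) = -sum_i (y_i - c_i)^2 on Y = [-5, 5]^n.
TARGET = [3.0, -2.0]
LO, HI = -5.0, 5.0


def v(y: Vector) -> float:
    return -sum((yi - ci) ** 2 for yi, ci in zip(y, TARGET))


def solve_local(y_t: Vector, beta: float) -> Vector:
    # v is separable, so the exact maximizer over the box intersected with Y
    # is found by clipping each target coordinate to its allowed interval.
    return [min(max(ci, max(LO, yi - beta)), min(HI, yi + beta))
            for yi, ci in zip(y_t, TARGET)]


centers = boxstep(v, solve_local, y1=[0.0, 0.0], beta=1.0)
print(centers)
```

With these assumed data the box moves from (0, 0) through (1, -1) and (2, -2) to (3, -2); a fourth local problem then produces no improvement and Step 3 stops the method.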

The BOXSTEP mnemonic comes from the fact that at each execution of
Step 2 the vector $y$ is restricted not only to be in the set $Y$ but also in
a box of size $2\beta$ centered at $y^t$ and the box steps toward the solution as $t$ is
incremented. The appeal of this restriction springs both from heuristics and
empirical observations which are discussed below. In essence, these results
indicate that $\beta = +\infty$, which corresponds to solving problem (P) all at once,
is not an optimal choice. Notice that the stopping condition at Step 3 is
based on the objective function rather than on $y^{t+1}$ being in the interior of
the box. This is necessary because $v(y)$ may be flat on top, in which case
we might never obtain an interior solution.

The simplicity of the concept would indicate that the convergence
of any algorithm is not upset when embedded in the BOXSTEP method. This
is formally verified in the subsequent theorem for which we need the
following definitions:

$\delta \equiv \max_{x, y \in Y} \|x - y\|_\infty$,

$\lambda \equiv \min\{\beta/\delta,\ 1\}$,

and

$v^* \equiv \max_{y \in Y} v(y)$.

Theorem:   If $Y$ is a compact convex set and $v$ is an upper semi-continuous
concave function on $Y$, then the BOXSTEP method will terminate after a finite
number of steps with a $2\epsilon/\lambda$ optimal solution.

Proof:   First we establish that the method terminates finitely. If $\epsilon > 0$,
then non-termination implies that $v(y^{t+1}) \ge v(y^1) + t\epsilon$ and, therefore,
$\limsup v(y^t) = \infty$. This contradicts the fact that an upper semi-continuous
function achieves its maximum on a compact set.

If $\epsilon = 0$, then either $\|y^t - y^{t+1}\|_\infty = \beta$ for each $t$ or
$\|y^T - y^{T+1}\|_\infty < \beta$ for some $T$. If $\|y^T - y^{T+1}\|_\infty < \beta$ then the norm
constraints are not binding and, by concavity, may be deleted. This implies
that $v(y^{T+1}) = v^*$ and termination must occur on the next iteration. If
$\|y^t - y^{t+1}\|_\infty = \beta$ for each $t$, then, without termination,
$v(y^{t+1}) > v(y^t)$ for each $t$. If $\|y^s - y^t\|_\infty < \beta/2$ for any $s > t$, then
this would contradict the construction of $y^{t+1}$ as the maximum over the box
centered at $y^t$ (because $\epsilon = 0$). Therefore $\|y^s - y^t\|_\infty \ge \beta/2$ for all
$s > t$ and for each $t$. This contradicts the compactness of $Y$. Hence the
method must terminate finitely.

When termination occurs, say at step $T$, we have $v(y^{T+1}) \le v(y^T) + \epsilon$.
Let $y^*$ be any point such that $v(y^*) = v^*$. Then, by concavity,

$v((1 - \lambda) y^T + \lambda y^*) \ge (1 - \lambda) v(y^T) + \lambda v(y^*)$.

Now the definition of $\lambda$ implies that

$\|(1 - \lambda) y^T + \lambda y^* - y^T\|_\infty \le \beta$,

and by the construction of $y^{T+1}$ it follows that

$v(y^{T+1}) + \epsilon \ge v((1 - \lambda) y^T + \lambda y^*) \ge (1 - \lambda) v(y^T) + \lambda v^*$.

Therefore, since termination occurred,

$v(y^T) + 2\epsilon \ge (1 - \lambda) v(y^T) + \lambda v^*$.

Hence,

$\lambda v(y^T) + 2\epsilon \ge \lambda v^*$,

that is,

$v(y^T) \ge v^* - 2\epsilon/\lambda$.

Q.E.D.
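
To see how the guarantee of the theorem depends on the box size, the bound $2\epsilon/\lambda$ can be evaluated directly. The helper below is a hypothetical illustration (the function name and the sample numbers are assumptions, not taken from the paper); it only restates the definitions of $\lambda$ and of the bound.

```python
def boxstep_gap_bound(beta: float, delta: float, eps: float) -> float:
    """Return the optimality gap bound 2*eps/lambda of the theorem.

    beta  : half-width of the box (beta > 0)
    delta : diameter of Y in the infinity norm (delta > 0)
    eps   : tolerance used in Steps 2 and 3 (eps >= 0)
    """
    if beta <= 0 or delta <= 0:
        raise ValueError("beta and delta must be positive")
    lam = min(beta / delta, 1.0)   # lambda = min{beta/delta, 1}
    return 2.0 * eps / lam


# A box ten times smaller than the diameter of Y weakens the bound tenfold.
print(boxstep_gap_bound(beta=1.0, delta=10.0, eps=0.01))
print(boxstep_gap_bound(beta=20.0, delta=10.0, eps=0.01))
```

The bound is a worst case: it says nothing about how close $y^T$ typically is to the optimum, only that a smaller $\beta$ calls for a smaller $\epsilon$ if the same guarantee is wanted.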

This robust convergence result requires at least one qualification.
With the exception of the case where $v$ is piecewise linear, there are not
many situations in which it is clearly evident that Step 2 can be
performed in a finite number of steps when $\epsilon = 0$ and the choice of $\epsilon > 0$
becomes mandatory. This touches upon the whole range of problems related
to speed of convergence and numerical stability in the execution of Step
2 but, due to the intentional vagueness of the term "appropriate", it is
not possible to discuss these except in the context of a specific
application.

One important modification of the basic method that can be
introduced without upsetting convergence is a line search. When far from the
solution, it is possible to view BOXSTEP as an ascent method which uses
Step 2 as a procedure employing more than strictly local information in
the determination of the next direction of search, $d = y^{t+1} - y^t$. As
stated above, a step of size one is taken in this direction. Depending
on the structure of (P), it may be beneficial to do an exact or
approximate line search maximizing $v(y^t + \theta(y^{t+1} - y^t))$ over feasible values of
$\theta \ge 1$. This can be incorporated in Step 3 in the natural way without
disturbing the statement of the theorem.
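
A minimal sketch of such a line search follows. It uses a golden-section search, which is a stand-in technique chosen here for brevity and not the method used by the authors; the names `golden_section_max` and `line_search_step` are hypothetical, and the caller is assumed to know the largest feasible step `theta_max`.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def golden_section_max(phi: Callable[[float], float],
                       lo: float, hi: float, tol: float = 1e-6) -> float:
    """Maximize a concave (hence unimodal) function phi on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:                 # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = phi(d)
        else:                       # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = phi(c)
    return (a + b) / 2.0


def line_search_step(v: Callable[[Vector], float], y_t: Vector,
                     y_next: Vector, theta_max: float) -> Tuple[float, Vector]:
    """Maximize v(y_t + theta*(y_next - y_t)) over 1 <= theta <= theta_max."""
    d = [b - a for a, b in zip(y_t, y_next)]

    def phi(theta: float) -> float:
        return v([a + theta * di for a, di in zip(y_t, d)])

    theta = golden_section_max(phi, 1.0, theta_max)
    return theta, [a + theta * di for a, di in zip(y_t, d)]


# Hypothetical example: the best point along the direction is at theta = 3.
v = lambda y: -(y[0] - 3.0) ** 2 - (y[1] - 3.0) ** 2
theta, y_new = line_search_step(v, [0.0, 0.0], [1.0, 1.0], theta_max=5.0)
print(theta, y_new)
```

Each trial value of $\theta$ costs one evaluation of $v$, so when $v(y)$ is the optimal value of a subproblem the search is only worthwhile if that subproblem is cheap to solve.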

It  should  be  emphasized  that  the  interpretation  of  BOXSTEP  as  an 
ascent  method  does  not  apply  in  the  neighborhood  of  a  solution.  Near 
termination, the BOXSTEP method becomes a restricted version of the
algorithm chosen to execute Step 2.

The  BOXSTEP  method  is  an  acceleration  device  which  should  encourage 
the  exploitation  of  problem  structure.   For  example,  successive  executions 
of  Step  2  involve  problems  which  are  highly  related  and  information  from 
one  solution  can  be  used  in  the  next  iteration.   The  well-known  advantages 
of  reoptimization  should  be  exploited  if  possible.   These  advantages  and 
other  important  computational  considerations  are  discussed  in  the  context 
of  the  more  specific  applications  contained  in  the  subsequent  sections. 
3.   Implementation:   solving the local problem by outer approximation.

We  now  specify  that  for  the  remainder  of  this  paper  Step  2  of  the 
BOXSTEP  method  is  to  be  executed  with  an  outer  approximation  (cutting 
plane) algorithm. Thus, in Geoffrion's framework [  ], each local
problem will be solved by Outer Linearization/Relaxation.

Both $v$ and $Y$ can be represented in terms of the family of their
linear supports. Thus

(3.1)  $v(y) = \min_{k \in K} (f^k + g^k y)$

(3.2)  $Y = \{ y \in R^n \mid p^j + q^j y \ge 0$ for $j \in J \}$

where $J$ and $K$ are index sets, $p^j$ and $f^k$ are scalars, and $q^j$ and $g^k \in R^n$.

These are such that

a)  for each $k \in K$ there is a $y \in Y$ with $v(y) = f^k + g^k y$; and

b)  for each $j \in J$ there is a $y \in Y$ with $p^j + q^j y = 0$.

In the applications presented in sections 4, 5, and 6 the function $v(y)$
represents the optimal value of a subproblem that is parameterized on $y$.
The set $Y$ contains all points $y$ for which (SPy) is of interest. When
$y \in Y$ the algorithm for (SPy) produces a linear support for $v$ at $y$, but
if $y \notin Y$ it produces a constraint that is violated at $y$. Thus in the
former case we get $f^{k^*}$ and $g^{k^*}$ such that

(3.3)  $v(y) = f^{k^*} + g^{k^*} y$

while in the latter case we get $p^{j^*}$ and $q^{j^*}$ such that

(3.4)  $p^{j^*} + q^{j^*} y < 0$.

Given the representations in (3.1) and (3.2), the local problem at
any $y^t \in Y$ can be written as

$P(y^t; \beta)$     $\max_{y, \sigma}\ \sigma$

subject to

$f^k + g^k y \ge \sigma$     for $k \in K$

$p^j + q^j y \ge 0$     for $j \in J$

$y_i^t - \beta \le y_i \le y_i^t + \beta$     for $i = 1, \dots, n$.

Let $\bar{P}(y^t; \beta)$ denote $P(y^t; \beta)$ with $J$ replaced by a finite $\bar{J} \subset J$ and $K$
replaced by a finite $\bar{K} \subset K$. Thus $\bar{P}(y^t; \beta)$ is a linear program and a
relaxation of $P(y^t; \beta)$. The outer approximation algorithm for Step 2 of
the BOXSTEP method can then be stated as follows.

Step 2a:   Choose an initial $\bar{J}$ and $\bar{K}$.

Step 2b:   Solve the linear program $\bar{P}(y^t; \beta)$ to obtain an optimal solution
$(\hat{y}, \hat{\sigma})$.

Step 2c:   If $\hat{y} \in Y$, continue to 2d. Otherwise determine $p^{j^*}$ and $q^{j^*}$ such that
$p^{j^*} + q^{j^*} \hat{y} < 0$.
Set $\bar{J} = \bar{J} \cup \{j^*\}$ and go to 2b.

Step 2d:   Determine $f^{k^*}$ and $g^{k^*}$ such that
$v(\hat{y}) = f^{k^*} + g^{k^*} \hat{y}$.
If $v(\hat{y}) < \hat{\sigma} - \epsilon$, set $\bar{K} = \bar{K} \cup \{k^*\}$ and go to 2b.

Step 2e:   Done; $(\hat{y}, \hat{\sigma})$ is $\epsilon$-optimal for $P(y^t; \beta)$.

The convergence properties of this procedure are discussed in Zangwill [  ]
and in Luenberger [  ].
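
The cutting plane loop of Steps 2a through 2e can be illustrated on a one-dimensional problem, where the relaxed linear program is small enough to solve by enumerating breakpoints. This is a sketch under simplifying assumptions, not the authors' implementation: $Y$ is taken to be the whole real line, so the feasibility test of Step 2c never adds a constraint, and the names `solve_relaxed_lp`, `outer_approximation` and `v_and_slope` are hypothetical.

```python
from itertools import combinations
from typing import Callable, List, Tuple

Cut = Tuple[float, float]   # (f, g) represents the support sigma <= f + g*y


def solve_relaxed_lp(cuts: List[Cut], lo: float, hi: float) -> Tuple[float, float]:
    """Step 2b in one dimension: max sigma s.t. sigma <= f + g*y, lo <= y <= hi.

    The maximum of a concave piecewise linear model over an interval is
    attained at an endpoint or where two cuts intersect.
    """
    candidates = [lo, hi]
    for (f1, g1), (f2, g2) in combinations(cuts, 2):
        if g1 != g2:
            y = (f2 - f1) / (g1 - g2)
            if lo <= y <= hi:
                candidates.append(y)
    model = lambda y: min(f + g * y for f, g in cuts)
    y_hat = max(candidates, key=model)
    return y_hat, model(y_hat)


def outer_approximation(v_and_slope: Callable[[float], Tuple[float, float]],
                        center: float, beta: float, eps: float = 1e-9,
                        max_cuts: int = 100) -> Tuple[float, float, List[Cut]]:
    lo, hi = center - beta, center + beta
    value, g = v_and_slope(center)
    cuts = [(value - g * center, g)]              # Step 2a: one cut at the center
    for _ in range(max_cuts):
        y_hat, sigma = solve_relaxed_lp(cuts, lo, hi)   # Step 2b
        value, g = v_and_slope(y_hat)                   # Step 2d: support at y_hat
        if value >= sigma - eps:
            return y_hat, sigma, cuts                   # Step 2e
        cuts.append((value - g * y_hat, g))
    raise RuntimeError("cut limit reached before eps-optimality")


def v_and_slope(y: float) -> Tuple[float, float]:
    # Hypothetical concave piecewise linear function v(y) = min(2y, 6 - y).
    pieces = [(0.0, 2.0), (6.0, -1.0)]
    f, g = min(pieces, key=lambda p: p[0] + p[1] * y)
    return f + g * y, g


y_hat, sigma, cuts = outer_approximation(v_and_slope, center=0.0, beta=3.0)
print(y_hat, sigma, len(cuts))
```

In this example two cuts suffice: the first model is maximized at the edge of the box, the support generated there intersects the first cut at $y = 2$, and the model value 4 agrees with $v(2)$. In higher dimensions Step 2b requires a genuine linear programming code.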

It is not necessary to solve every $P(y^t; \beta)$ to completion (i.e.,
$\epsilon$-optimality). If a tolerance $\Delta > 0$ is given and if at Step 2d we find that

(3.5)     $v(\hat{y}) > v(y^t) + \Delta$

then we could immediately move the center of the box to $\hat{y}$ and set $y^{t+1} = \hat{y}$.
This would eliminate unnecessary work far from the global maximum.

Step 2b requires the solution of a linear program. The role of
reoptimization within a single execution of Step 2 is obvious. Between
successive  executions  of  Step  2  there  is  an  opportunity  for  reoptimization 
arising  from  the  fact  that  the  constraints  indexed  by  J  and  K  are  valid 
globally.   Thus  at  Step  2a  we  may  choose  the  initial  J  and  K  to  include  any 
constraints  generated  in  earlier  boxes.  An  obvious  choice  is  to  retain  any 
constraint  that  was  binding  in  the  final  optimal  tableau  of  the  preceding 
local  problem. 

The  question  of  whether  or  not  to  carry  over  constraints  from  one 
local  problem  to  the  next  is,  in  fact,  an  empirical  one  and  the  answer  is 
highly  dependent  on  problem  structure.   For  the  application  discussed  in 
section  5  it  was  clearly  best  to  save  old  constraints  and  omit  the  line 
search  between  boxes.   This  was  because  of  the  substantial  computational 
burden involved in reoptimizing the subproblem (SPy). In the application
of section 4, however, it proved best to discard all constraints upon
completing a local problem and to use the line search before placing the next
box.

In the case where $v(y)$ is the optimal value of (SPy), there is also
an opportunity to use reoptimization techniques on (SPy). When this is
possible the BOXSTEP method is especially attractive. This is because
the successive points $y$ for which the solution of (SPy) is needed are
prevented, by the box, from being very far apart. Suppose, for example, that
(SPy) is an out-of-kilter problem whose arc costs depend on $y$. Reoptimization
is then very fast for small changes in $y$ but may equal the solution-from-scratch
time for large changes in $y$.
Finally, the desire to use the framework of column generation for the
reoptimization techniques at Step 2b dictates that we work on the dual of
$\bar{P}(y^t; \beta)$. We record this dual here for future reference.

(3.6)     $\min\ \sum_{k \in \bar{K}} f^k \lambda_k + \sum_{j \in \bar{J}} p^j \mu_j - \sum_{i=1}^{n} (y_i^t - \beta)\,\delta_i^+ + \sum_{i=1}^{n} (y_i^t + \beta)\,\delta_i^-$

s.t.      $\sum_{k \in \bar{K}} \lambda_k = 1$

          $\sum_{k \in \bar{K}} (-g^k) \lambda_k + \sum_{j \in \bar{J}} (-q^j) \mu_j - I\delta^+ + I\delta^- = 0$

          $\lambda,\ \mu,\ \delta^+,\ \delta^- \ge 0$.

The similarity of (3.6) to a Dantzig-Wolfe master problem will be commented
upon in the next section.

All  of  the  computational  results  presented  in  the  subsequent  sections 
were  obtained  by  implementing  the  BOXSTEP  method,  as  described  above,  within 
the  SEXOP  linear  programming  system  [   ] . 
4.   Application:   price-directive decomposition

Consider  the  linear  program 

(DW)      $\min_{x \in X}\ cx$   s.t.   $Ax \le b$

where $X$ is a non-empty polytope and $A$ is an $(m \times n)$ matrix. This is the
problem addressed by the classical Dantzig-Wolfe decomposition method [   ].
It is assumed that the constraints determined by $A$ are coupling or
"complicating" constraints in the sense that it is much easier to solve the
Lagrangian subproblem

(SPy)     $\min_{x \in X}\ cx + y(Ax - b)$

for a given value of $y$ than to solve (DW) itself. If we let $v(y)$ denote
the minimal value of (SPy), then the dual of (DW) with respect to the
coupling constraints can be written as

(4.1)  $\max_{y \ge 0} v(y)$.

Let $\{x^k \mid k \in K\}$ be the extreme points and $\{z^j \mid j \in J\}$ be the extreme
rays of $X$. Then $v(y) > -\infty$ if and only if $y \in Y'$ where

(4.2)  $Y' = \{ y \in R^m \mid cz^j + yAz^j \ge 0$ for $j \in J \}$

and when $y \in Y'$ we have

(4.3)  $v(y) = \min_{k \in K} (cx^k + yAx^k) - yb$.
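
Formula (4.3) can be checked numerically on a toy instance by enumerating the extreme points of $X$. The sketch below is illustrative only: the function name `lagrangian_value` and the data (a unit square with one coupling constraint) are assumptions, and a realistic subproblem would be solved by a specialized algorithm, not by enumeration. The minimizing extreme point $x^k$ also yields the linear support of $v$ at $y$, whose slope is $Ax^k - b$.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], w: Sequence[float]) -> float:
    return sum(ui * wi for ui, wi in zip(u, w))


def lagrangian_value(y: Sequence[float], c: Sequence[float],
                     A: Sequence[Sequence[float]], b: Sequence[float],
                     extreme_points: Sequence[Sequence[float]]
                     ) -> Tuple[float, List[float], Sequence[float]]:
    """Evaluate v(y) = min_k (c x^k + y A x^k) - y b by enumeration.

    Returns (v(y), slope A x^k - b of the linear support, minimizing x^k).
    """
    best = None
    for x in extreme_points:
        slack = [dot(row, x) - bi for row, bi in zip(A, b)]   # A x - b
        value = dot(c, x) + dot(y, slack)
        if best is None or value < best[0]:
            best = (value, slack, x)
    return best


# Hypothetical data: X is the unit square, with one coupling row x1 + x2 <= 1.
c = [-1.0, -2.0]
A = [[1.0, 1.0]]
b = [1.0]
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(lagrangian_value([0.0], c, A, b, square))
print(lagrangian_value([1.5], c, A, b, square))
```

For this instance the optimal value of (DW) is -2, attained at $x = (0, 1)$. At $y = 0$ the subproblem ignores the coupling row and returns the lower bound -3 with slope 1, which points toward larger $y$; at $y = 1.5$ the bound is tight and the slope is zero.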

The set $Y$ of interest is the intersection of $Y'$ with the non-negative
orthant, since (4.1) specifies that $y$ be non-negative. Thus $Y$ and $v(y)$
are of the form discussed in section 3, except for the $yb$ term in (4.3).
(Note also that $y$ is a row vector.) In this context the local problem is

(4.4)     $\max\ \sigma - yb$

s.t.      $\sigma - y(Ax^k) \le cx^k$     for $k \in K$

          $-y(Az^j) \le cz^j$     for $j \in J$

          $-y_i \le -\max\{0,\ y_i^t - \beta\}$     for $i = 1, \dots, m$

          $y_i \le y_i^t + \beta$     for $i = 1, \dots, m$

and the problem solved at Step 2b (see (3.6)) is

(4.5)     $\min\ \sum_{k \in \bar{K}} (cx^k) \lambda_k + \sum_{j \in \bar{J}} (cz^j) \mu_j - \sum_{i=1}^{m} \max\{0,\ y_i^t - \beta\}\,\delta_i^+ + \sum_{i=1}^{m} (y_i^t + \beta)\,\delta_i^-$

s.t.      $\sum_{k \in \bar{K}} \lambda_k = 1$

          $\sum_{k \in \bar{K}} (-Ax^k) \lambda_k + \sum_{j \in \bar{J}} (-Az^j) \mu_j - I\delta^+ + I\delta^- = -b$

          $\lambda,\ \mu,\ \delta^+,\ \delta^- \ge 0$.

If the point $y^t$ is in fact the origin ($y^t = 0$) then the objective
function of (4.5) becomes

(4.6)     $\sum_{k \in \bar{K}} (cx^k) \lambda_k + \sum_{j \in \bar{J}} (cz^j) \mu_j + \sum_{i=1}^{m} \beta\,\delta_i^-$

and hence (4.5) becomes exactly the Dantzig-Wolfe master problem. There
is a slack variable $\delta_i^+$ and a surplus variable $\delta_i^-$ for each row. Since the
constraints were $Ax \le b$ each slack variable has zero cost while each
surplus variable is assigned the positive cost $\beta$. The cost $\beta$ must be large
enough to drive all of the surplus variables out of the solution. In terms
of the BOXSTEP method, then, Dantzig-Wolfe decomposition is simply the
solution of one local problem over a sufficiently large box centered at the
origin. The appearance of a surplus variable in the final solution would
indicate that the cost $\beta$ was not large enough, i.e., that the box was not
large enough.

The  test  problem  that  we  have  used  is  a  linear  program  of  the  form 
(DW)  which  represents  a  network  design  model.  The  matrix  A  has  one  row 
for  each  arc  in  the  network.  The  set  X  has  no  extreme  rays,  hence  Y  is 
just  the  non-negative  orthant.  The  subproblem  (SPy)  separates  into  two 
parts.  The  first  part  involves  finding  all  shortest  routes  through  the 
network.  The  second  part  can  be  reduced  to  a  continuous  knapsack  problem. 
For  a  network  with  M  nodes  and  L  arcs  problem  (DW),  written  as  a  single 
linear  program,  has  M(M  -  1)  +  3L  +  1  constraints  and  2LM  +  3L  variables. 
The  details  of  this  model  are  given  by  Agarwal  [   ] . 

For  this  type  of  problem  the  best  performance  was  obtained  by  solving 
each  local  problem  from  scratch.   Thus  constraints  from  previous  local 
problems  were  not  saved.   A  line  search,  as  indicated  in  section  2,  was 
performed  between  successive  boxes.   This  was  done  with  an  adaptation  of 
Fisher and Shapiro's efficient method for concave piecewise linear
functions [   ].

Table 1 summarizes our results for a test problem with $M = 12$ nodes
and $L = 18$ arcs. The problem was run with several different box sizes.
Each run started at the same point $y^1$, a heuristically determined
solution arising from the interpretation of the problem. For each box size $\beta$
the column headed $N(\beta)$ gives the average number of constraints generated
per box. Notice that this number increases monotonically as the box size
increases. For a fixed box size, the number of constraints generated per
box did not appear to increase systematically as we approached the global
optimum. The column headed T gives the total computation time, in seconds,
for a CDC6400.

Table 1.   Solution of network design test problem by BOXSTEP (price
directive) with varying box sizes.

$\beta$ (box size)    no. of boxes required    $N(\beta)$    T (seconds)

       0.1                    34               12.7          172
       0.5                    18               14.2          118
       1.0                    13               17.1          104
       2.0                     9               17.7           88
       3.0                     6               25.0           99
       4.0                     4               26.8           76
       5.0                     5               33.4          134
       6.0                     4               34.3          115
       7.0                     3               38.0          119
      20.0                     2               67.5          203
      25.0                     2               74.0          243
      30.0                     1               74.0          128
    1000.0                     1               97.0          217

The largest box ($\beta = 1000$) represents the Dantzig-Wolfe end of the
scale. The smallest box ($\beta = 0.1$) produces an ascent that is close to
being a steepest ascent. A pure steepest ascent algorithm, as proposed
by Grinold [   ], was tried on this problem. With Grinold's primal/dual
step-size rule the steps became very short very quickly. By taking
optimal size steps instead, we were able to climb higher but appeared to
be converging to the value 5097. The maximum was at 5665. The poor
performance of steepest ascent is consistent with our poor results for
very small boxes.

5.   Application:   resource-directive decomposition

The dual or Lagrangian orientation of the previous section is
complemented by the formulation discussed in this section. Here the function
v(y)  is  obtained  from  the  application  of  Geoffrion's  primal  resource- 
directive  strategy  [  ]  to  a  large  structured  linear  program. 

A large-scale contract selection and distribution problem is
formulated as a structured mixed integer linear programming problem by Austin
and Hogan [   ]. The linear program consists of a large single commodity
network with a few resource constraints on some arcs. The integer
portion of the problem models a binary decision regarding the inclusion or
exclusion of certain arcs in the network. Embedded in a branch-and-
bound scheme, the bounding problem is always a network problem with
resource constraints. Hogan [   ] has applied a primal resource directive
strategy to such problems, resulting in a subproblem of the form


(SPy)     $\min\ \sum_r c_r x_r$

s.t.      $\sum_{r \in B_i} x_r - \sum_{r \in A_i} x_r = 0$     for all $i$

          $\ell_r \le x_r \le h_r$     for $r \notin R$

          $\ell_r \le x_r \le \min\{h_r,\ y_r\}$     for $r \in R$

where

$x_r$  :   is the flow on arc $r$
$c_r$  :   is the cost per unit of flow on arc $r$
$\ell_r$  :   is the lower bound of flow on arc $r$
$h_r$  :   is an upper bound of flow on arc $r$
$B_i$  :   the arcs with node $i$ as sink
$A_i$  :   the arcs with node $i$ as source
$R$   :   the arcs subject to resource constraints.

For any vector $y$, $v(y)$ is the minimal value of (SPy). Note that there is
one variable $y_r$ for each arc that is resource-constrained. Let

(5.1)  $Y^1 = \{ y \mid$ (SPy) is feasible $\}$,

and

(5.2)  $Y^2 = \{ y \mid \sum_{r \in R} a_{ir} y_r \le b_i$ for $i = 1, \dots, m \}$.

Thus $Y^2$ is the feasible region (in $y$-space) determined by the resource
constraints. The set $Y$ of interest is then $Y^1 \cap Y^2$.

The piecewise linear convex function $v$ can be evaluated at any point
$y$ by solving a single commodity network problem. As a by-product we
obtain a linear support for $v$ at $y$. Similarly, the implicit constraints in
$Y^1$ can be represented as a finite collection of linear inequalities, one
of which is easily generated whenever (SPy) is not feasible. Thus $v(y)$
lends itself naturally to an outer approximation solution strategy. The
details are given by Hogan [  ].

An outer approximation algorithm was, in fact, implemented for the
problem

(5.3)     $\min_{y \in Y} v(y)$.

This algorithm was subsequently embedded in the BOXSTEP method with
substantial computational improvement. To examine the effect of varying the
box size, twenty five test problems of the type found in [   ] were
randomly generated and solved. The basic networks had approximately 650
arcs of which four important arcs were constrained by two resource
constraints. In each case, the problem was initialized with a solution of
(SPy) without the resource constraints (i.e., $y_r \ge h_r$) and a randomly
generated initial value of $y$. The mean B6700 seconds to solution are
recorded in Table 2 under the column headed $T_1$. Sporadically available
results from larger test problems indicated greater sensitivity of
solution times to box sizes.

The  theoretic  appeal  of  BOXSTEP  in  this  case  arises  from  three  major 
points  which  can  be  classified  as:   cut  requirements,  reoptimization  of 
the  subproblem,  and  reoptimization  of  the  local  problem. 

Cut requirements:  The $v$ function is piecewise linear or polyhedral. Hence
its epigraph, epi $v$, has a finite number of faces. To characterize $v$ over
any subset of $Y$, the outer approximation method generates a series of cuts
or linear supports. Each cut is coincident with at least one face of epi $v$
and no cut is repeated. The smaller the subset of $Y$ the smaller the number
of faces and, therefore, the smaller the number of cuts needed to describe
$v$. It follows that the smaller the box, the smaller the computational
burden in solving the local problem at Step 2. This has already been
demonstrated by the results of section 4.

Reoptimization of the subproblem:  Once an initial solution for the basic
network has been obtained, $v(y)$ can be determined by reoptimizing the basic
network with changes in some of the arc capacities (i.e., for $r \in R$). If
these changes are small, as they must be when $\beta$ is small, these
reoptimizations can be performed quickly. Since most of the computational burden
in this problem is devoted to reoptimizing the network, this proves to be
a significant consideration.

Table 2.  Solution of 25 resource constrained network problems by
BOXSTEP (resource directive) with varying box sizes.

$\beta$ (box size)    $T_1$ (seconds)    $T_2$ (seconds)

     10^                  31.8               4.1
     10^                  23.1               5.8
     10^                  21.2              14.9
     10^                  34.6              25.0

$T_1$ :   Mean time to solution using a randomly generated vector as the
          initial $y$.

$T_2$ :   Mean time to solution using an optimal solution as the initial $y$.

Reoptimization of the local problem:  Since the generation of cuts requires
the relatively expensive reoptimization of the subproblem, some or all of
these cuts should be retained as the box moves. This greatly reduces the
number of cuts that must be generated to solve each local problem after
the first one. In contrast to the application presented in section 4,
saving cuts paid off but the line search did not. The mechanism for saving
cuts was very simple. Up to a fixed number were accumulated. Once that
number was reached every new cut replaced an old non-basic cut.
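
The cut-saving mechanism just described can be sketched as a small bounded pool. The class below is a hypothetical illustration, not the authors' code. The text does not say which non-basic cut is displaced, so the sketch assumes the oldest one; it also assumes the caller supplies an `is_basic` predicate from the current linear programming basis, and it lets the pool grow temporarily if every stored cut happens to be basic.

```python
from typing import Callable, Hashable, List, Optional


class CutPool:
    """Bounded store of cuts carried from one local problem to the next."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.cuts: List[Hashable] = []     # oldest cut first

    def add(self, cut: Hashable,
            is_basic: Callable[[Hashable], bool]) -> Optional[Hashable]:
        """Store a new cut; return the cut it displaced, if any."""
        if len(self.cuts) < self.capacity:
            self.cuts.append(cut)          # still accumulating
            return None
        for i, old in enumerate(self.cuts):
            if not is_basic(old):          # assumption: evict the oldest non-basic cut
                del self.cuts[i]
                self.cuts.append(cut)
                return old
        self.cuts.append(cut)              # assumption: grow if all cuts are basic
        return None


pool = CutPool(capacity=2)
pool.add("cut-a", is_basic=lambda cut: False)
pool.add("cut-b", is_basic=lambda cut: False)
displaced = pool.add("cut-c", is_basic=lambda cut: cut == "cut-a")
print(displaced, pool.cuts)
```

Here "cut-a" is treated as basic in the current tableau, so the new cut displaces "cut-b" even though "cut-a" is older.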

Although a smaller box requires less effort to solve the problem within
the box, the results indicate a trade-off between the size of the box and
the number of moves required to solve the overall problem (5.3). There is
a notable exception to this rule, however. Frequently, if not always,
there is a readily available prior estimate of an optimal solution point
$y^*$. Most large-scale problems have a natural physical or economic
interpretation which will yield a reasonable estimate of $y^*$. In this
application, recall that (5.3) is actually the bounding problem in a
branch-and-bound scheme. The v function changes slightly as we move from
one branch to another. The minimizing point $y^*$ changes little if at
all. Thus we wish to solve a sequence of highly related problems of the
form (5.3). Using the previous $y^*$ as the starting point on a new branch
would seem quite reasonable. Furthermore, the size of the box used should
be inversely related to our confidence in this estimate. To illustrate
this important point, the twenty-five test problems were restarted with
the box centered at the optimal solution. The mean time required, in B6700
seconds, to solve the problem over this box is recorded in Table 2 under
the column headed $T_2$. These data and experience with this class of
problems in the branch-and-bound procedure indicate that starting with a
good estimate of the solution and a small box size reduces the time
required to solve (5.3) by an order of magnitude as compared to the case
with $\beta = +\infty$. Clearly a major advantage of the BOXSTEP method is
the ability to capitalize on prior information.
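The interplay between starting point, box size and number of boxes can be illustrated with a toy sketch. It assumes a separable concave function v(y) = -sum_i (y_i - target_i)^2, for which the local problem over a box is solved exactly by clipping. The names `solve_in_box` and `boxstep` and all data are hypothetical, introduced only for illustration; they are not the implementation or the test problems used in the experiments above.

```python
import numpy as np


def solve_in_box(target, center, beta):
    """Maximize v(y) = -sum((y - target)**2) over {y : |y_i - center_i| <= beta}.

    For this separable toy function the maximizer is the target clipped
    to the box. A real application would call an LP or cutting-plane
    routine here instead.
    """
    return np.clip(target, center - beta, center + beta)


def boxstep(target, start, beta, max_boxes=10_000):
    """Return (y, number_of_boxes) for the toy problem."""
    center = np.asarray(start, dtype=float)
    for boxes in range(1, max_boxes + 1):
        y = solve_in_box(target, center, beta)
        # A maximizer strictly inside the box is globally optimal by concavity.
        if np.all(np.abs(y - center) < beta):
            return y, boxes
        center = y  # otherwise recenter the box at y and try again
    raise RuntimeError("box limit reached")
```

With target (3, -2, 0.5) and box size 1, starting at the origin takes four boxes, while starting at the target itself takes a single box. In this toy case the local solve costs the same for every box size, so only the box count responds to the starting point; the trade-off against local effort described above does not appear.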

Ascent  methods  typically  exploit  prior  information  in  the  form  of  a 
good  initial  estimate  but  the  information  generated  during  the  solution 
procedure  is  not  cumulative.   Outer  approximation  methods,  in  contrast,  do 
not  exploit  prior  information  but  the  cuts  generated  during  solution  are 
cumulative. The current applications of the BOXSTEP method fall in the
conceptual continuum between these two extremes and capture the best
features of both.
6.  Application:  price-direction revisited

Some of the resource-constrained network problems introduced in section 5
have also been solved by price-directive decomposition as developed in
section 4. These results will be presented very briefly. Using the
notation of section 5, the problem as given is


(6.1)    $\min \sum_r c_r x_r$

(6.2)    s.t.  $\sum_{r \in B_i} x_r - \sum_{r \in A_i} x_r = 0$    for all i

(6.3)          $\ell_r \le x_r \le h_r$    for all r

(6.4)          $\sum_r a_{ir} x_r \le b_i$    for i = 1, ..., m


If  we  dualize  with  respect  to  the  resource  constraints  (6.4),  the 
resulting  subproblem  is  a  network  problem  parameterized  in  its  arc  costs 
rather  than  in  its  capacities. 
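To make the dualization concrete, a worked form is sketched here under an assumed sign convention: a multiplier $y_i \ge 0$ is attached to each resource constraint in (6.4). The convention used in section 4 may differ.

```latex
v(y) \;=\; \min_{x}\ \Big\{ \sum_r \Big( c_r + \sum_{i=1}^{m} y_i a_{ir} \Big) x_r
      \;-\; \sum_{i=1}^{m} y_i b_i \ :\ x \text{ satisfies (6.2) and (6.3)} \Big\},
\qquad \max_{y \ge 0}\ v(y).
```

For fixed y the inner problem is an ordinary network flow problem whose arc costs are $c_r + \sum_i y_i a_{ir}$; the flow conservation constraints and capacities are unchanged.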

Tests were made with a network of 142 nodes and 1551 arcs. The
unconstrained solution (which took 5 seconds on the CDC6400) had positive
flow on only 100 of the arcs. These 100 arcs were divided into 5 sets of
20 each and 5 resource constraints were constructed. For the
price-directive approach, then, y was a 5-dimensional vector. As in
section 5, the best performance was obtained by saving old cuts and
omitting the line search.

Table 3 contains the results for a series of runs, each starting at the
origin. The maximum number of cuts retained was 20. The time is given in
CDC6400 seconds. (For this type of application the CDC6400 is between 5
and 6 times faster than the B6700.) In addition, the twenty-five problems
reported in section 5 were solved by price-directive decomposition on a
B6700, and the counterpart of Table 2 is reported in Table 4.
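The cut-retention policy (a fixed maximum number of cuts, with each new cut replacing an old non-basic cut once the limit is reached) can be sketched as a small bounded pool. The class name `CutPool`, the representation of a cut as a tuple `(f, g)`, and the choice to discard the oldest non-basic cut are assumptions made for illustration; the text does not say which non-basic cut was replaced.

```python
from dataclasses import dataclass, field


@dataclass
class CutPool:
    """Bounded store of cuts (f, g), each standing for the support f + g.y."""

    max_cuts: int = 20
    cuts: list = field(default_factory=list)  # oldest cut first

    def add(self, cut, basic):
        """Add `cut`; `basic` holds the cuts currently basic in the master LP."""
        if len(self.cuts) >= self.max_cuts:
            for k, old in enumerate(self.cuts):
                if old not in basic:  # never discard a basic cut
                    del self.cuts[k]
                    break
            # Assumption: if every stored cut is basic, the pool may grow.
        self.cuts.append(cut)
```

A master problem would call `add` after each subproblem solve, passing the set of cuts whose variables or slacks are basic in its current solution.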
Table 3.  Price-directive results for a resource-constrained network problem.

Box size (β)    Number of boxes required    T (seconds)

     0.1                   ?                   > 150
     0.5                   7                      70
     1.0                   4                      55
     2.0                   2                      76
     3.0                   2                     119
     4.0                   1                      92
     5.0                   1                     108
  1000.0                   1                   > 150


Table 4.  Solution of 25 resource-constrained network problems by
BOXSTEP (price-directive) with varying box sizes.

β (box size)    T_1 (seconds)    T_2 (seconds)

    10^0             67.8              3.7
    10^              35.1              4.8
    10^              17.0              9.9
    10^              22.4             14.9
    10^              27.1             17.0

T_1 :  Mean time to solution using a randomly generated vector as the
       initial y.

T_2 :  Mean time to solution using an optimal solution as the initial y.
7.   Conclusion

We  have  only  scratched  the  surface  as  far  as  applications  of  the 
BOXSTEP  method  are  concerned.   The  main  avenues  for  future  work  appear 
to  be  the  following. 

Structured linear programs: Many other problems for which decomposition
has failed in the past need to be re-investigated. This is especially true
when BOXSTEP is considered in conjunction with other new developments in
large-scale optimization (more on this below).

Structured non-linear programs: BOXSTEP has yet to be tried on non-linear
problems. Two very likely starting points are Geoffrion's tangential
approximation approach to the resource-directive decomposition of
non-linear systems [   ] and Geoffrion's generalized Benders decomposition
method [   ].

General non-linear programs: In the case where v(y) is an explicitly
available concave function, BOXSTEP could perhaps be used to accelerate
the convergence of any outer approximation or cutting plane algorithm.
This has not yet been tried. There may be other kinds of algorithms that
can profitably be embedded in the BOXSTEP method.

Integer programming: Geoffrion [   ] and Fisher and Shapiro [   ] have
recently shown how the maximization of Lagrangian functions can provide
strong bounds in a branch-and-bound framework. The BOXSTEP method should
find many fruitful applications in this context. It has the desirable
property that the maximum values for successive boxes form a monotonically
increasing sequence.
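The monotonicity property follows in one line from the way the boxes are recentered. In the short derivation below, $y^t$ denotes the maximizer over the t-th box $B^t$; this indexing is introduced here only for illustration.

```latex
y^{t} \in Y \cap B^{t+1}
\quad (\text{since } B^{t+1} \text{ is centered at } y^{t})
\;\Longrightarrow\;
v(y^{t+1}) \;=\; \max_{y \in Y \cap B^{t+1}} v(y) \;\ge\; v(y^{t}).
```

Strictly speaking this shows that the sequence is non-decreasing, so each successive box yields a bound at least as strong as the previous one.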

There  are  also  several  tactical  questions  to  be  investigated.   Is 
there  a  rule-of-thumb  for  the  best  box  size  to  use  for  a  given  problem? 
When  should  cuts  be  saved?  When  is  the  line  search  beneficial?  We  shall 
confine  ourselves  here  to  one  such  question:   How  should  the  starting
point  y  be  chosen? 

The BOXSTEP method can take advantage of a good starting point. This
point may be derived heuristically from the interpretation of the problem,
as in section 4, or it may be taken as the optimal solution of a closely
related problem, as in section 5. Alternatively, we can start BOXSTEP
where some other algorithm leaves off. Suppose that an algorithm with
good initial behavior but slow convergence is applied to the given
problem. As soon as this algorithm stops making satisfactory progress a
switch can be made to the BOXSTEP method. This has been tried with
dramatic success. BOXSTEP has been coupled with the subgradient relaxation
method recently introduced by Held and Karp [   ] and Held and Wolfe [   ].
A sample of the results will indicate the benefits that may be derived
from this kind of hybrid algorithm.
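One way to decide when the first algorithm has stopped "making satisfactory progress" is a simple stall test on the history of objective values. The sketch below is purely illustrative: the function name `stalled`, the window length and the tolerance are assumptions, and the experiments reported here used a fixed number of steps rather than a test of this kind.

```python
def stalled(values, window=25, rel_tol=1e-3):
    """Return True when the last `window` iterations barely improve the best value.

    `values` is the history of objective values v(y^t) from a maximizing
    algorithm. The best value in the last `window` entries is compared
    with the best value before them.
    """
    if len(values) <= window:
        return False  # not enough history to judge
    before = max(values[:-window])
    recent = max(values[-window:])
    return recent - before <= rel_tol * max(1.0, abs(before))
```

A hybrid driver would append the current value after each step of the first algorithm and hand its current point to BOXSTEP as soon as `stalled` returns True.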

Recall the representation (3.1) of v(y) in terms of its linear supports.
If $v(y^t) = f^{k(t)} + g^{k(t)} y^t$, then it is well-known that
$g^{k(t)}$ is a subgradient of v at $y^t$. Held and Wolfe propose the
following iterative process starting at any point $y^1$:

(7.1)    $y^{t+1} = y^t + s_t g^{k(t)}$

where $g^{k(t)}$ is a subgradient of v at $y^t$ and $\{s_t\}_{t=1}^{\infty}$
is a sequence for which $s_t \to 0$ but $\sum_{t=1}^{\infty} s_t = \infty$.
Any point generated by this process could be taken as the starting point
for BOXSTEP. The hybrid algorithm was tested on the p-median problem (see
Marsten [   ] or ReVelle and Swain [   ]). The continuous version of this
problem has the form

(7.2)    $\min \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij}$

(7.3)    s.t.  $\sum_{j=1}^{n} x_{jj} = p$

(7.4)          $x_{ij} \le x_{jj}$    for all i ≠ j

(7.5)          $0 \le x_{ij} \le 1$    for all i, j

(7.6)          $\sum_{j=1}^{n} x_{ij} = 1$    for i = 1, ..., n

The price-directive approach developed in section 4 was applied to this
problem. Dualizing with respect to the constraints (7.6) results in a
Lagrangian subproblem for which a very efficient solution recovery
technique has been devised by Blankenship [   ]. The starting point for BOXSTEP
was obtained by making 250 steps of the process (7.1). The sequence
$\{s_t\}$ was taken as five repetitions of 40/t for each t = 1, ..., 50.
The results are given in Table 5. The column headed p gives the number of
medians sought, as specified in (7.3). The columns headed $L_{125}$ and
$L_{250}$ give the value of the Lagrangian v(y) after 125 and after 250
steps. $L_{\max}$ is the maximum value of the Lagrangian. $T_1$ and $T_2$
give the time devoted to (7.1) and to BOXSTEP respectively, in CDC6400
seconds. The last column gives the number of boxes used by the BOXSTEP
method. This test problem had n = 33, and the box size $\beta = 1$ was
used in each case. (For p = 2 and 4 an optimal solution and zero
subgradient were obtained very quickly, making the subsequent steps and
BOXSTEP solution superfluous.) These results indicate that, by using more
than strictly local information, BOXSTEP is able to attain and verify the
optimal solution quickly. Hybrid algorithms of this type should produce
significant computational breakthroughs in the very near future.
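The subgradient phase of the hybrid can be sketched for a small p-median instance. The sketch assumes that the assignment constraints $\sum_j x_{ij} = 1$ are the ones dualized, with multipliers $y_i$, and it evaluates the Lagrangian by a standard closed-form rule (open the p columns with the smallest net cost); this is not claimed to be Blankenship's technique. The step schedule is one reading of "five repetitions of 40/t". All function names are hypothetical, and the brute-force routine is included only as a check on tiny instances.

```python
import itertools

import numpy as np


def pmedian_lagrangian(c, p, y):
    """Return (L(y), subgradient) with sum_j x_ij = 1 dualized by y_i."""
    n = c.shape[0]
    reduced = c - y[:, None]                  # reduced costs c_ij - y_i
    gain = np.minimum(reduced, 0.0)           # x_ij = 1 only if it lowers cost
    np.fill_diagonal(gain, 0.0)
    rho = np.diag(reduced) + gain.sum(axis=0)     # net cost of opening median j
    medians = np.argsort(rho, kind="stable")[:p]  # exactly p medians, as in (7.3)
    x = np.zeros((n, n))
    for j in medians:
        x[:, j] = reduced[:, j] < 0.0
        x[j, j] = 1.0
    value = y.sum() + rho[medians].sum()
    subgrad = 1.0 - x.sum(axis=1)             # residual of sum_j x_ij = 1
    return value, subgrad


def subgradient_ascent(c, p, y, steps):
    """Apply the update y <- y + s_t * g of (7.1); return (last y, best value)."""
    value, g = pmedian_lagrangian(c, p, y)
    best = value
    for s in steps:
        if not g.any():                       # zero subgradient: y is optimal
            break
        y = y + s * g
        value, g = pmedian_lagrangian(c, p, y)
        best = max(best, value)
    return y, best


def brute_force_pmedian(c, p):
    """Optimal integer p-median cost by enumeration (tiny instances only)."""
    n = c.shape[0]
    return min(c[:, list(S)].min(axis=1).sum()
               for S in itertools.combinations(range(n), p))


# One reading of the schedule: the values 40/t, t = 1..50, repeated five times.
steps = [40.0 / t for t in range(1, 51)] * 5
```

Every value L(y) is a lower bound on the optimal p-median cost, so the best value found by `subgradient_ascent` can be compared with `brute_force_pmedian` on small instances. The final y is the kind of point that would be handed to BOXSTEP as its starting point.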
Table 5.  Results for p-median problem with hybrid algorithm.

  p      L_125      L_250      L_max      T_1     T_2    boxes

  2    17474.0    17474.0    17474.0      9.2     0.3      1
  3    14538.7    14622.5    14627.0      8.9    19.6      3
  4    12363.0    12363.0    12363.0     11.8     0.2      1
  8     7449.1     7454.0     7460.0     13.6     8.4      3
  9     6840.9     6843.6     6846.0     11.5     3.3      2
 10     6263.8     6265.4     6267.0     14.9     0.7      2
 11     5786.8     5786.9     5787.0     15.2     1.0      2
 20     2779.7     2785.9     2786.0     14.5     0.9      2
 30      509.8      514.3      515.0     14.6     1.7      2